Gesture marking of disfluencies in spontaneous speech
نویسندگان
چکیده
Speakers effectively use both visual and acoustic cues to convey information in speech. While earlier research has concentrated on the association of visual cues (provided by gestures) with fluent prosodic structure, this study looks at the relationship between visual cues, prosodic markers and spoken disfluencies. Preliminary results suggested that speakers preferentially perform gestures in the eye region in spoken disfluencies, but a more careful frame-by-frame analysis capturing all gestures revealed that movements of the eye region (blinks, frowns, eyebrow raises and changes in direction of eyegaze) occur with high frequency in both fluent and non-fluent speech. The paper describes a method for frame-by-frame labelling of speechaccompanying gestures for a speech sample, whose output can then be combined with independently derived labels of the prosody. Initial analysis of 3 minute samples from two speakers reveals that one speaker produces eye movements in association with disfluencies and the other does not, and that this tendency does not result from alignment of brow gestures with pitch accents.
منابع مشابه
Synthesising Filled Pauses: Representation and Datamixing
Filled pauses occur frequently in spontaneous human speech, yet modern text-to-speech synthesis systems rarely model these disfluencies overtly, and consequently they do not output convincing synthetic filled pauses. This paper presents a text-to-speech system that is specifically designed to model these particular disfluencies more efffectively. A preparatory investigation shows that a synthet...
متن کاملGesture production and speech fluency in competent speakers and language learners
It is often assumed that a main function of gestures is to compensate for expressive difficulties. This predicts that gestures should mainly occur with disfluent speech. However, surprisingly little is known about the relationship between gestures and fluent vs. disfluent speech. This study investigates the putative compensatory role of gesture by examining competent speakers’ and language lear...
متن کاملDisfluencies in Switchboard
Disfluencies (“um,” repeats, self-repairs) are prevalent in spontaneous speech, and are relevant to both human speech communication and speech processing by machine. Although disfluencies have commonly been viewed as ‘noisy’ events, results from a large descriptive study indicate that disfluencies show regularities in a number of dimensions [9]. This paper reports selected results on Switchboar...
متن کاملA Corpus of Spontaneous Speech in Lectures: The KIT Lecture Corpus for Spoken Language Processing and Translation
With the increasing number of applications handling spontaneous speech, the needs to process spoken languages become stronger. Speech disfluency is one of the most challenging tasks to deal with in automatic speech processing. As most applications are trained with well-formed, written texts, many issues arise when processing spontaneous speech due to its distinctive characteristics. Therefore, ...
متن کاملCoping with disfluencies in spontaneous speech recognition
Nowadays, automatic speech recognizers have become quite good in recognizing well prepared fluent speech (e.g. news readings). However, the recognition of spontaneous speech is still problematic. Some important reasons for this are that spontaneous speech is usually less articulated and contains a lot of disfluencies. In this paper, a new methodology for coping with disfluencies is presented an...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005